Abstract. The sample covariance matrix and the multivariate F-matrix play important roles in multivariate statistical analysis. The central limit theorems (CLT) of linear spectral statistics associated with these matrices were established in Bai and Silverstein (2004) and Zheng (2012), which received considerable attention and have been applied to solve many large dimensional statistical problems. However, the sample covariance matrices used in these papers are not centralized, and there exist some questions about the CLTs defined by the centralized sample covariance matrices. In this note, we provide some short complements on the CLTs in Bai and Silverstein (2004) and Zheng (2012), and show that the results in these two papers remain valid for the centralized sample covariance matrices, provided that the ratios of dimension $p$ to sample sizes $(n, n_1, n_2)$ are redefined as $p/(n-1)$ and $p/(n_i-1)$, $i = 1, 2$, respectively.

Key words and phrases. Linear spectral statistics, central limit theorem, centralized sample covariance matrix, centralized F-matrix, simplified sample covariance matrix, simplified F-matrix.

1 Introduction 

Let $\{X_{jk}, j,k = 1, 2, \dots\}$ and $\{Y_{jk}, j,k = 1, 2, \dots\}$ be two independent double arrays of independent random variables, either both real or both complex. In the sequel, we use $A^*$ to denote the complex conjugate transpose of a vector or matrix $A$. For $p > 1$, $n > 1$ and $N > 1$, we define $X = (X_1, \dots, X_n)$ and $Y = (Y_1, \dots, Y_N)$ with column vectors $X_j = (X_{j1}, \dots, X_{jp})'$, $1 \le j \le n$, and $Y_k = (Y_{k1}, \dots, Y_{kp})'$, $1 \le k \le N$. Let $T_p$ be a $p \times p$ non-negative definite (nnd) matrix. There exists a unique nnd matrix $T_p^{1/2}$ such that $T_p = (T_p^{1/2})^2$. Then, $(T_p^{1/2}X_1, \dots, T_p^{1/2}X_n)$ and $(T_p^{1/2}Y_1, \dots, T_p^{1/2}Y_N)$ can be considered as two independent samples of sizes $n$ and $N$, respectively, drawn from a $p$-dimensional population with population covariance matrix $T_p$.

It is well known that the sample covariance matrices for $T_p^{1/2}X$ and $T_p^{1/2}Y$ are often defined as

$$S_x = \frac{1}{n-1}\left(\sum_{i=1}^{n} T_p^{1/2}X_iX_i^*T_p^{1/2} - n\,T_p^{1/2}\bar X\bar X^*T_p^{1/2}\right) \tag{1.1}$$

$$S_y = \frac{1}{N-1}\left(\sum_{i=1}^{N} T_p^{1/2}Y_iY_i^*T_p^{1/2} - N\,T_p^{1/2}\bar Y\bar Y^*T_p^{1/2}\right) \tag{1.2}$$

respectively, where $\bar X = \frac{1}{n}\sum_{j=1}^{n} X_j$ and $\bar Y = \frac{1}{N}\sum_{i=1}^{N} Y_i$. The multivariate F-matrix is then defined as

$$F = S_xS_y^{-1}. \tag{1.3}$$

Notice that the matrices defined in (1.1)-(1.3) are transformation invariant; we will call them the centralized sample covariance matrices and the centralized multivariate F-matrix, respectively.
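The definitions (1.1)-(1.3) can be illustrated numerically. The following is a minimal sketch only, assuming real Gaussian entries, $T_p = I$, and illustrative sizes; the helper names `centralized_cov` and `f_matrix` are hypothetical and not part of the original note. It checks that (1.1) agrees with the usual unbiased sample covariance and is unchanged when all observations are shifted by a common vector.

```python
import numpy as np

def centralized_cov(samples):
    """Centralized sample covariance as in (1.1); columns of `samples` are observations."""
    p, n = samples.shape
    mean = samples.mean(axis=1, keepdims=True)        # sample mean vector (p x 1)
    gram = samples @ samples.conj().T                 # sum_i Z_i Z_i^*
    return (gram - n * (mean @ mean.conj().T)) / (n - 1)

def f_matrix(x_samples, y_samples):
    """Centralized F-matrix as in (1.3); requires p < N so that S_y is invertible."""
    s_x = centralized_cov(x_samples)
    s_y = centralized_cov(y_samples)
    return s_x @ np.linalg.inv(s_y)

rng = np.random.default_rng(0)
p, n, N = 5, 40, 60                                   # illustrative sizes (assumed)
X = rng.standard_normal((p, n))
Y = rng.standard_normal((p, N))

S_x = centralized_cov(X)
shift = rng.standard_normal((p, 1))                   # common shift applied to every observation
S_x_shifted = centralized_cov(X + shift)
F = f_matrix(X, Y)
F_eigs = np.sort(np.linalg.eigvals(F).real)           # eigenvalues of F are real and positive here
```

The simplified matrices discussed next drop the mean correction and normalize by $n$ and $N$ instead of $n-1$ and $N-1$.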

To guarantee that the definition (1.3) makes sense, we need to assume that $p < N$ and that $T_p$ is positive definite. Because the eigenvalues of $F$ are independent of $T_p$, we may assume that $T_p$ is an identity matrix.

Due to Corollary A.41 and Theorem A.43 of Bai and Silverstein (2009), in the literature of random matrix theory, the sample covariance matrices are usually simplified as

$$B_x = \frac{1}{n}\sum_{i=1}^{n} T_p^{1/2}X_iX_i^*T_p^{1/2}, \qquad B_y = \frac{1}{N}\sum_{i=1}^{N} T_p^{1/2}Y_iY_i^*T_p^{1/2}, \tag{1.4}$$

and the multivariate F-matrix is simplified as $G = B_xB_y^{-1}$.

Bai and Silverstein (2004) considered the central limit theorem (CLT) of the linear spectral statistics (LSS) of the simplified sample covariance matrix $B_x$ and provided explicit expressions of the asymptotic means and covariance functions for $B_x$. Later, Zheng (2012) extended the work of Bai and Silverstein (2004) to the case of the multivariate F-matrix $G$ and obtained explicit expressions of the asymptotic means, variances, and covariances for $G$.

Examining the inequalities derived from Corollary A.41 and Theorem A.43 of Bai and Silverstein (2009), one finds that the difference between the empirical spectral distributions (ESD) of $S_x$ and $B_x$ is of the order $O(n^{-1})$. Hence, we conclude that $S_x$ and $B_x$ have the same limiting spectral distribution (LSD). However, the scale normalizers in the CLTs of LSS of the random matrices $S_x$ and $B_x$ have the same order as $p$. Thus, it is expected that the asymptotic biases in the CLTs of LSS of $S_x$ and $B_x$ differ slightly. Upon such a consideration, Pan (2012) reconsidered the CLT of LSS of the centralized sample covariance matrix $S_x$. To reduce the asymptotic bias, he added an additional term to that of Bai and Silverstein (2004), that is,

(1.5)

where $\underline{m}_y(z) = -\frac{1-y}{z} + y\,m_y(z)$, $m_y(z)$ is the Stieltjes transform of the LSD of $S_x$, $H(t)$ is the LSD of $T_p$ and $y_n = p/n \to y > 0$.

It is well known that when the population is multivariate-normally distributed, the centralized sample covariance matrix $S_x$ has the same distribution as the simplified covariance matrix $B_x$ with sample size $n-1$ and population mean zero. This fact suggests that the same phenomenon should hold asymptotically in the general case. In this note, we give short proofs showing that if the simplified sample covariance matrix $B_x$ is replaced by the centralized sample covariance matrix $S_x$, the result of Bai and Silverstein (2004) remains valid provided that the ratio of dimension to sample size $y_n$ is replaced by $p/(n-1)$ (this is equivalent to $c_n = n/(N-1)$ in the notation of Bai and Silverstein (2004)). This result is equivalent to, but much simpler than, that of Pan (2012) in both expressions and proof. Moreover, we prove that if the simplified multivariate F-matrix $G$ is replaced by the centralized $F$, the results of Zheng (2012) remain valid provided that the ratios of dimension to sample sizes, $y_{n_1}$ and $y_{n_2}$, are replaced by $p/(n-1)$ and $p/(N-1)$.

The remainder of this note is arranged as follows: Section 2 states the main theorems and gives the proof of Theorem 2.2. Section 3 gives the proof of Theorem 2.1. The technical lemmas and their proofs are postponed to Section 4.

2 Main Results 

As mentioned in the previous section, the centralized covariance matrix will have the same LSD as that of the 
corresponding simplified covariance matrix. In this note, we shall prove the following theorems. 

Theorem 2.1 Assume that

(a) For each $p$, $\{X_{ij}, i \le p, j \le n\}$ are independent random variables with $EX_{ij} = 0$, $E|X_{ij}|^2 = 1$, and satisfying

$$\frac{1}{np}\sum_{j=1}^{p}\sum_{k=1}^{n} E|X_{jk}|^4 I\left(|X_{jk}| \ge \eta\sqrt{n}\right) \to 0, \quad \text{for any fixed } \eta > 0. \tag{2.1}$$

Note that the random variables may be allowed to depend on $p$, but we suppress this dependence from the notation for brevity.

(b) We assume $E|X_{ij}|^4 = 3$ for the real case, and $E|X_{ij}|^4 = 2$ and $EX_{ij}^2 = 0$ for the complex case.

(c) $y_n = p/n \to y$, and

(d) $T_p$ is a $p \times p$ non-random nnd Hermitian matrix with spectral norm bounded in $p$, and its ESD $H_p \to H$, where $H$ is a proper probability distribution.

Let $f$ be an analytic function on an open region in the complex plane which covers the support of the LSD of $S_x$ with the origin excluded.
Then

(i) the random variables

$$X_p(f) = p\int f(x)\,d\left(F^{S_x}(x) - F^{(y_{n-1},H_p)}(x)\right), \tag{2.2}$$

with $y_{n-1} = p/(n-1)$, form a tight sequence in $p$, where $F^{S_x}$ is the ESD of the centralized sample covariance matrix $S_x$, and $F^{(y,H)}$ is the LSD of $S_x$, whose Stieltjes transform $m_y(z)$ satisfies $\underline{m}_y(z) = y\,m_y(z) - (1-y)/z$, where $\underline{m}_y(z)$ is the unique solution to the equation

$$z = -\frac{1}{\underline{m}_y} + y\int \frac{t}{1 + t\,\underline{m}_y(z)}\,dH(t) \tag{2.3}$$

in the upper half complex plane for each $z \in \mathbb{C}^+ = \{z : \Im(z) > 0\}$.

(ii) The random variables in (2.2) converge weakly to Gaussian variables $X_f$ with the same means and covariance functions as given in Theorem 1.1 of Bai and Silverstein (2004).

The proof of Theorem 2.1 is postponed to Section 3.
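To make the role of equation (2.3) concrete, the sketch below solves it in the simplest case where $H$ is a point mass at 1 (that is, $T_p = I$), in which case (2.3) reduces to the quadratic $z\,\underline{m}^2 + (z + 1 - y)\,\underline{m} + 1 = 0$. This is an illustration under that assumption only; the function names and the sample values of $z$, $p$ and $n$ are hypothetical. The centered ratio $p/(n-1)$ is used, as Theorem 2.1 prescribes.

```python
import cmath

def mp_companion_stieltjes(z, y):
    """Solve z = -1/m + y/(1+m), i.e. (2.3) with H a point mass at 1, for m in C^+."""
    # Clearing denominators gives z*m^2 + (z + 1 - y)*m + 1 = 0.
    b = z + 1 - y
    disc = cmath.sqrt(b * b - 4 * z)
    roots = ((-b + disc) / (2 * z), (-b - disc) / (2 * z))
    # Keep the root lying in the upper half plane.
    return max(roots, key=lambda m: m.imag)

def lsd_stieltjes(z, y):
    """Recover m_y(z) from the companion transform via m_under = y*m - (1-y)/z."""
    m_under = mp_companion_stieltjes(z, y)
    return (m_under + (1 - y) / z) / y

z0 = 1.0 + 1.0j                  # a test point in the upper half plane (assumed)
p, n = 200, 400                  # illustrative dimension and sample size (assumed)
y_centered = p / (n - 1)         # ratio prescribed for the centralized matrix
m_under = mp_companion_stieltjes(z0, y_centered)
m_plain = lsd_stieltjes(z0, y_centered)
```

For a general $H$ no closed form is available, and one would instead iterate on (2.3) numerically.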

As for the CLT of LSS of the F-matrix, we have the following theorem.

Theorem 2.2 Assume that

1. the two arrays $\{X_{jk}, j \le p, k \le n\}$ and $\{Y_{jk}, j \le p, k \le N\}$ satisfy, for any fixed $\eta > 0$,

$$\frac{1}{np}\sum_{j=1}^{p}\sum_{k=1}^{n} E|X_{jk}|^4 I\left(|X_{jk}| \ge \eta\sqrt{n}\right) \to 0, \qquad \frac{1}{Np}\sum_{j=1}^{p}\sum_{k=1}^{N} E|Y_{jk}|^4 I\left(|Y_{jk}| \ge \eta\sqrt{N}\right) \to 0. \tag{2.4}$$

2. For all $j, k$, $E|X_{jk}|^4 = \beta_x + 1 + \kappa$ and $E|Y_{jk}|^4 = \beta_y + 1 + \kappa$, where $\kappa = 2$ for the real case and $\kappa = 1$ for the complex case. If both $X$ and $Y$ are complex valued, then $EX_{jk}^2 = EY_{jk}^2 = 0$. Moreover, $y_n = p/n \to y_1 > 0$ and $y_N = p/N \to y_2 \in (0, 1)$.

Let $f$ be an analytic function in an open region of the complex plane containing the interval $\left[\frac{(1-h)^2}{(1-y_2)^2}, \frac{(1+h)^2}{(1-y_2)^2}\right]$, the support of the continuous part of the LSD $F_{\mathbf{y}}$ of the F-matrix, where $h = \sqrt{y_1 + y_2 - y_1y_2}$ and $\mathbf{y} = (y_1, y_2)$.
Then, as $p \to \infty$, the random variables

$$W_p(f) = p\int f(x)\,d\left[F^{F}(x) - F_{\{y_{n-1},\,y_{N-1}\}}(x)\right]$$

converge weakly to Gaussian variables $\{W_f\}$ which have the same means and covariance functions as given in Zheng (2012), where $F^{F}(x)$ is the ESD of the centralized F-matrix $F$ and $F_{\{y_1,y_2\}}(x)$ is the LSD defined by (2.4) of Zheng (2012).
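As a small illustration of how the redefined ratios enter Theorem 2.2, the sketch below evaluates the endpoints of the support interval, assuming they take the form $(1 \mp h)^2/(1-y_2)^2$ with $h = \sqrt{y_1 + y_2 - y_1y_2}$ as in Zheng (2012). The function name `f_lsd_support` and the sizes are hypothetical choices made for this example.

```python
import math

def f_lsd_support(y1, y2):
    """Endpoints of the continuous part of the F-matrix LSD (assumed form, y2 < 1)."""
    h = math.sqrt(y1 + y2 - y1 * y2)
    a = (1 - h) ** 2 / (1 - y2) ** 2
    b = (1 + h) ** 2 / (1 - y2) ** 2
    return a, b

p, n, N = 50, 200, 400                                   # illustrative sizes (assumed)
a_plain, b_plain = f_lsd_support(p / n, p / N)           # ratios for the simplified G
a_centered, b_centered = f_lsd_support(p / (n - 1), p / (N - 1))  # ratios for the centralized F
```

The two sets of endpoints differ only at order $1/n$, which is negligible for the LSD but, after multiplication by $p$, is exactly the scale at which the centering term of the CLT matters.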
Proof. As mentioned in Section 1, we may assume $T_p$ to be an identity matrix. We split the proof into two steps by writing

$$\mathrm{tr}(F - zI_p)^{-1} - p\,m_{\{y_{n-1},y_{N-1}\}}(z) = \left[\mathrm{tr}(S_xS_y^{-1} - zI_p)^{-1} - p\,m^{\{y_{n-1},F^{S_y}\}}(z)\right] + p\left[m^{\{y_{n-1},F^{S_y}\}}(z) - m_{\{y_{n-1},y_{N-1}\}}(z)\right],$$

where $F^{S_y^{-1}}(t)$ and $F^{S_y}(t)$ are the ESDs of $S_y^{-1}$ and $S_y$, $m_{\{y_1,y_2\}}$ is the Stieltjes transform of the LSD of the F-matrix, $\underline{m}^{\{y_{n-1},F^{S_y}\}}(z) = -\frac{1-y_{n-1}}{z} + y_{n-1}\,m^{\{y_{n-1},F^{S_y}\}}(z)$, and

$$z = -\frac{1}{\underline{m}^{\{y_{n-1},F^{S_y}\}}} + y_{n-1}\int \frac{t\,dF^{S_y^{-1}}(t)}{1 + t\,\underline{m}^{\{y_{n-1},F^{S_y}\}}} = -\frac{1}{\underline{m}^{\{y_{n-1},F^{S_y}\}}} + y_{n-1}\int \frac{dF^{S_y}(t)}{t + \underline{m}^{\{y_{n-1},F^{S_y}\}}}. \tag{2.5}$$

Step 1. Given $S_y$, the proof of Theorem 2.1 shows that the process $\mathrm{tr}(S_xS_y^{-1} - zI_p)^{-1} - p\,m^{\{y_{n-1},F^{S_y}\}}(z)$ weakly tends to a Gaussian process on the contour with mean and covariance function as given in (6.29) and (6.30) of Zheng (2012).

Step 2. By (2.3), we have

$$z = -\frac{1}{\underline{m}_{\{y_{n-1},y_{N-1}\}}} + y_{n-1}\int \frac{dF_{y_{N-1}}(t)}{t + \underline{m}_{\{y_{n-1},y_{N-1}\}}}, \tag{2.6}$$

where $F_{y_{N-1}}$ is the M-P law with ratio of dimension to sample size $y_{N-1}$. Subtracting both sides of (2.5) from those of (2.6), and writing $F_{N-1} = F^{S_y}$, we obtain

$$p\left[\underline{m}^{\{y_{n-1},F^{S_y}\}}(z) - \underline{m}_{\{y_{n-1},y_{N-1}\}}(z)\right] = -y_{n-1}\,\underline{m}_{\{y_{n-1},y_{N-1}\}}\,\underline{m}^{\{y_{n-1},F^{S_y}\}}\,\frac{p\left[m_{N-1}(-\underline{m}_{\{y_{n-1},y_{N-1}\}}) - m_{y_{N-1}}(-\underline{m}_{\{y_{n-1},y_{N-1}\}})\right]}{1 - y_{n-1}\displaystyle\int \frac{\underline{m}_{\{y_{n-1},y_{N-1}\}}\,\underline{m}^{\{y_{n-1},F^{S_y}\}}\,dF_{N-1}(t)}{\left(t + \underline{m}_{\{y_{n-1},y_{N-1}\}}\right)\left(t + \underline{m}^{\{y_{n-1},F^{S_y}\}}\right)}}, \tag{2.7}$$

which weakly tends to a Gaussian process on the contour with mean and covariance function as given in (6.33) and (6.34) of Zheng (2012), where $m_{N-1}$ is the Stieltjes transform of the ESD $F_{N-1}(x)$ of $S_y$, $m_{y_{N-1}}$ is the Stieltjes transform of the M-P law $F_{y_{N-1}}$,

$$z = -\frac{1}{\underline{m}_{y_{N-1}}} + \frac{y_{N-1}}{1 + \underline{m}_{y_{N-1}}}$$

and $\underline{m}_{y_{N-1}}(z) = -\frac{1-y_{N-1}}{z} + y_{N-1}\,m_{y_{N-1}}(z)$. Since both companion transforms in (2.7) are related to the original ones through the same ratio $y_{n-1}$, the left-hand side of (2.7) equals $y_{n-1}$ times the second term of the decomposition above. By Theorem 2.1 and (2.7) we obtain that

$$\mathrm{tr}(F - zI_p)^{-1} - p\,m_{\{y_{n-1},y_{N-1}\}}(z) \quad \text{and} \quad \mathrm{tr}(G - zI_p)^{-1} - p\,m_{\{y_n,y_N\}}(z)$$

have the same asymptotic distribution. Hence, the random variables

$$p\int f(x)\,d\left(F^{F}(x) - F_{\{y_{n-1},y_{N-1}\}}(x)\right)$$

converge weakly to Gaussian variables $W_f$ with the same means and covariance functions as in Zheng (2012).

Then the proof of Theorem 2.2 is completed. ■



3 The Proof of Theorem 2.1 



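The condition (2.1) allows us to truncate the random variables at $\eta_n\sqrt{n}$ and then renormalize them to have mean zero and variance 1, where $\eta_n \to 0$ slowly. Note that the 4th moments of the random variables may not be the same, but they will be $\kappa + 1 + \beta_x + o(1)$, and for the complex case we have $EX_{ij}^2 = o(n^{-1})$. The contour is defined similarly to Bai and Silverstein (2004). Define $\gamma_i = \frac{1}{\sqrt{n}}T_p^{1/2}X_i$. Then, we have

$$S_x = \frac{n}{n-1}\sum_{i=1}^{n}(\gamma_i - \bar\gamma)(\gamma_i - \bar\gamma)^* = \sum_{i=1}^{n}\gamma_i\gamma_i^* - \frac{1}{n-1}\sum_{j \ne k}\gamma_j\gamma_k^* = B_x - \Delta,$$

where $\bar\gamma = \frac{1}{n}\sum_{i=1}^{n}\gamma_i$ and $\Delta = \frac{1}{n-1}\sum_{j \ne k}\gamma_j\gamma_k^*$. Because

$$(S_x - zI)^{-1} = (A - \Delta)^{-1} = A^{-1} + (A - \Delta)^{-1}\Delta A^{-1} = A^{-1} + A^{-1}\Delta A^{-1} + A^{-1}(\Delta A^{-1})^2 + (A - \Delta)^{-1}(\Delta A^{-1})^3,$$

where $A(z) = B_x - zI$, we obtain

$$\begin{aligned}
p\left[\tfrac{1}{p}\mathrm{tr}(S_x - zI)^{-1} - m^0_{n-1}(z)\right]
&= p\left(\tfrac{1}{p}\mathrm{tr}(A - \Delta)^{-1} - m^0_n(z) + m^0_n(z) - m^0_{n-1}(z)\right) \\
&= p\left(\tfrac{1}{p}\mathrm{tr}A^{-1}(z) - m^0_n(z)\right) + p\left(m^0_n(z) - m^0_{n-1}(z)\right) + \mathrm{tr}A^{-2}(z)\Delta \\
&\quad + \mathrm{tr}A^{-1}(z)(\Delta A^{-1}(z))^2 + \mathrm{tr}(A(z) - \Delta)^{-1}(\Delta A^{-1}(z))^3,
\end{aligned} \tag{3.1}$$

where $\underline{m}^0_n(z) = -\frac{1-y_n}{z} + y_n\,m^0_n(z)$, and $\underline{m}^0_n(z)$ and $\underline{m}^0_{n-1}(z)$ satisfy

$$z = -\frac{1}{\underline{m}^0_n} + \frac{p}{n}\int \frac{t}{1 + t\,\underline{m}^0_n}\,dH_p(t), \tag{3.2}$$

$$z = -\frac{1}{\underline{m}^0_{n-1}} + \frac{p}{n-1}\int \frac{t}{1 + t\,\underline{m}^0_{n-1}}\,dH_p(t), \tag{3.3}$$

while the limit $\underline{m}_y$ satisfies

$$z = -\frac{1}{\underline{m}_y} + y\int \frac{t}{1 + t\,\underline{m}_y}\,dH(t). \tag{3.4}$$

By (3.1) and Lemmas 4.1 and 4.6, we have

$$\mathrm{tr}(S_x - zI)^{-1} - p\,m^0_{n-1}(z) = p\left(\tfrac{1}{p}\mathrm{tr}A^{-1}(z) - m^0_n(z)\right) + o_p(1). \tag{3.5}$$

So Theorem 2.1 has the same asymptotic mean and covariance function as Theorem 1.1 of Bai and Silverstein (2004).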
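The two algebraic facts behind (3.1), namely the decomposition $S_x = B_x - \Delta$ with $\Delta = \frac{1}{n-1}\sum_{j\ne k}\gamma_j\gamma_k^*$ and the three-term resolvent expansion of $(A - \Delta)^{-1}$, are exact identities and can be checked numerically. The following is a minimal sketch assuming real entries, $T_p = I$, and small illustrative sizes; the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 4, 12                                   # illustrative sizes (assumed)
z = 0.7 + 0.5j                                 # a point off the real axis (assumed)

X = rng.standard_normal((p, n))                # T_p = I, real entries
G = X / np.sqrt(n)                             # columns are gamma_i = X_i / sqrt(n)
B_x = G @ G.T                                  # simplified sample covariance matrix
col_sum = G.sum(axis=1, keepdims=True)
# sum_{j != k} gamma_j gamma_k^* = (sum_j gamma_j)(sum_k gamma_k)^* - sum_j gamma_j gamma_j^*
Delta = (col_sum @ col_sum.T - B_x) / (n - 1)
S_x = B_x - Delta                              # should equal the centralized matrix (1.1)

A = B_x - z * np.eye(p)
A_inv = np.linalg.inv(A)
D = Delta @ A_inv                              # Delta A^{-1}
resolvent = np.linalg.inv(A - Delta)           # (S_x - zI)^{-1}
expansion = A_inv + A_inv @ D + A_inv @ D @ D + resolvent @ D @ D @ D
```

Taking traces of `expansion` term by term gives exactly the four trace terms appearing in (3.1).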

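Tightness of $\mathrm{tr}(S_x - zI)^{-1} - \mathrm{tr}(B_x - zI)^{-1}$. Using what has been proved in Bai and Silverstein (2004), we only need to prove that there is an absolute constant $M$ such that for any $z_1, z_2 \in \mathcal{C}$,

$$\begin{aligned}
E\left|\mathrm{tr}(B_x - z_1I)^{-1}(B_x - z_2I)^{-1} - \mathrm{tr}(S_x - z_1I)^{-1}(S_x - z_2I)^{-1}\right|
&= E\left|\sum_{i=1}^{p}\frac{(\tilde\lambda_i - \lambda_i)(\lambda_i + \tilde\lambda_i - z_1 - z_2)}{(\lambda_i - z_1)(\lambda_i - z_2)(\tilde\lambda_i - z_1)(\tilde\lambda_i - z_2)}\right| \\
&\le K\sum_{i=1}^{p}E|\lambda_i - \tilde\lambda_i|\,I(B_n) + o(1) \le M,
\end{aligned} \tag{3.6}$$

where $\{\lambda_i\}$ and $\{\tilde\lambda_i\}$ are the eigenvalues of $B_x$ and $S_x$, respectively, arranged in descending order, and the event $B_n$ is defined as $x_l + \epsilon < \tilde\lambda_p \le \lambda_1 < x_r - \epsilon$ such that

$$P(B_n^c) = o(n^{-3})$$

(for the justification of the definition of $B_n$, see Bai and Silverstein (1998)). The last step of (3.6) follows from the fact that

$$\sum_{i=1}^{p}|\lambda_i - \tilde\lambda_i| = \sum_{i=1}^{p}(\lambda_i - \tilde\lambda_i) \le \lambda_1 - \tilde\lambda_p \le x_r$$

by the interlacing theorem.

The equi-continuity of $E\,\mathrm{tr}(S_x - zI)^{-1} - p\,m^0_{n-1}(z)$ can be proved in a similar way to that for the tightness of $\mathrm{tr}(S_x - zI)^{-1} - E\,\mathrm{tr}(B_x - zI)^{-1}$.

By now, the proof of Theorem 2.1 is completed. ■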
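The interlacing step used in the tightness argument can be illustrated numerically. One caveat, stated here as an assumption of the sketch rather than a claim about the proof: with $T_p = I$, $S_x = \frac{n}{n-1}(B_x - \bar X\bar X^*)$, so exact interlacing holds between $B_x$ and the rank-one downdate $C = B_x - \bar X\bar X^*$, and $S_x$ differs from $C$ only by the factor $n/(n-1)$. The code below therefore checks interlacing for $C$; all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 6, 30                                   # illustrative sizes (assumed)
X = rng.standard_normal((p, n))

B_x = X @ X.T / n                              # simplified sample covariance matrix
x_bar = X.mean(axis=1, keepdims=True)
C = B_x - x_bar @ x_bar.T                      # rank-one downdate; S_x = n/(n-1) * C

lam = np.sort(np.linalg.eigvalsh(B_x))[::-1]          # eigenvalues of B_x, descending
lam_tilde = np.sort(np.linalg.eigvalsh(C))[::-1]      # eigenvalues of C, descending
total_gap = np.sum(np.abs(lam - lam_tilde))           # sum_i |lambda_i - lambda_tilde_i|
```

Because the eigenvalues interlace, the absolute values can be dropped and the sum telescopes, which is why it is controlled by the spread $\lambda_1 - \tilde\lambda_p$ rather than growing with $p$.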



4 Technical Lemmas 

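Lemma 4.1 Under the assumptions of Theorem 2.1, for every $z \in \mathbb{C}^+$, we have

$$p\left(m^0_n(z) - m^0_{n-1}(z)\right) \to \frac{\left(1 + z\,\underline{m}_y(z)\right)\left(\underline{m}_y(z) + z\,\underline{m}_y'(z)\right)}{z\,\underline{m}_y(z)}.$$

Proof. We have

$$\underline{m}^0_n(z) = -\frac{n-p}{n}\,\frac{1}{z} + \frac{p}{n}\,m^0_n(z), \qquad \underline{m}^0_{n-1}(z) = -\frac{n-1-p}{n-1}\,\frac{1}{z} + \frac{p}{n-1}\,m^0_{n-1}(z), \tag{4.1}$$

where $p/n \to y > 0$. By (3.4), we obtain

$$\int \frac{t}{1 + t\,\underline{m}_y}\,dH(t) = \frac{1 + z\,\underline{m}_y}{y\,\underline{m}_y}. \tag{4.2}$$

Using (3.2)-(3.3), we obtain

$$\frac{\underline{m}^0_n - \underline{m}^0_{n-1}}{\underline{m}^0_n\,\underline{m}^0_{n-1}} = \frac{p}{n}\left(\underline{m}^0_n - \underline{m}^0_{n-1}\right)\int \frac{t^2}{(1 + t\,\underline{m}^0_n)(1 + t\,\underline{m}^0_{n-1})}\,dH_p(t) + \frac{p}{n(n-1)}\int \frac{t}{1 + t\,\underline{m}^0_{n-1}}\,dH_p(t),$$

that is,

$$n\left(\underline{m}^0_n - \underline{m}^0_{n-1}\right) = \frac{\dfrac{p}{n-1}\displaystyle\int \frac{t}{1 + t\,\underline{m}^0_{n-1}}\,dH_p(t)}{\dfrac{1}{\underline{m}^0_n\,\underline{m}^0_{n-1}} - \dfrac{p}{n}\displaystyle\int \frac{t^2}{(1 + t\,\underline{m}^0_n)(1 + t\,\underline{m}^0_{n-1})}\,dH_p(t)} \to \frac{y\displaystyle\int \frac{t}{1 + t\,\underline{m}_y}\,dH(t)}{\dfrac{1}{\underline{m}_y^2} - y\displaystyle\int \frac{t^2}{(1 + t\,\underline{m}_y)^2}\,dH(t)}. \tag{4.3}$$

Differentiating (3.4) with respect to $z$ shows that the denominator of the limit in (4.3) equals $1/\underline{m}_y'(z)$. By (4.1), (4.2) and (4.3), we have

$$p\left(m^0_n - m^0_{n-1}\right) = n\left(\underline{m}^0_n - \underline{m}^0_{n-1}\right) + \underline{m}^0_{n-1} + \frac{1}{z} \to \frac{(1 + z\,\underline{m}_y)\,\underline{m}_y'}{\underline{m}_y} + \frac{1 + z\,\underline{m}_y}{z} = \frac{(1 + z\,\underline{m}_y)(\underline{m}_y + z\,\underline{m}_y')}{z\,\underline{m}_y}. \tag{4.4}$$

Thus, we prove that Lemma 4.1 holds.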
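The limit in Lemma 4.1 can be checked numerically in the special case $H = \delta_1$ (that is, $T_p = I$), where (3.2)-(3.4) reduce to the quadratic $z\,\underline{m}^2 + (z + 1 - y)\,\underline{m} + 1 = 0$. The sketch below is an illustration under that assumption; the function names, the test point $z$ and the sizes are hypothetical, and $\underline{m}_y'$ is obtained by implicit differentiation of (3.4).

```python
import cmath

def companion(z, y):
    """Root in C^+ of z*m^2 + (z + 1 - y)*m + 1 = 0, i.e. (3.4) with H a point mass at 1."""
    b = z + 1 - y
    disc = cmath.sqrt(b * b - 4 * z)
    return max(((-b + disc) / (2 * z), (-b - disc) / (2 * z)), key=lambda m: m.imag)

def plain(z, y):
    """m(z) recovered from the companion transform, as in (4.1)."""
    return (companion(z, y) + (1 - y) / z) / y

z = 1.0 + 1.0j                    # test point in the upper half plane (assumed)
p, n = 2000, 4000                 # illustrative sizes with p/n = 0.5 (assumed)

# Left-hand side of Lemma 4.1 at finite n.
lhs = p * (plain(z, p / n) - plain(z, p / (n - 1)))

# Right-hand side: the claimed limit, evaluated at y = p/n.
y = p / n
m = companion(z, y)
m_prime = 1 / (1 / m ** 2 - y / (1 + m) ** 2)   # dm/dz from differentiating (3.4)
rhs = (1 + z * m) * (m + z * m_prime) / (z * m)
```

At these sizes the two sides agree to roughly three significant digits, consistent with an $O(1/n)$ approximation error.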

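In the sequel, we shall use the Vitali lemma frequently. Let

$$\Delta = \frac{1}{n}\sum_{j \ne k}\gamma_j\gamma_k^*.$$

(It should be $\frac{1}{n-1}\sum_{j \ne k}\gamma_j\gamma_k^*$, but this does no harm to the limit.)

We will derive the limit of $\mathrm{tr}(A - \Delta)^{-1} - \mathrm{tr}(A^{-1})$.

Lemma 4.2 After truncation and normalization, we have $E\left|\gamma_k^*A^{-1}\gamma_k - (1 + z\,\underline{m}_y(z))\right|^2 \le Kn^{-1}$ for every $z \in \mathbb{C}^+$.

Proof. We have $\gamma_k^*A^{-1}\gamma_k = \gamma_k^*A_k^{-1}\gamma_k\,\beta_k = 1 - \beta_k$, where $A_k = A - \gamma_k\gamma_k^*$ and $\beta_k = (1 + \gamma_k^*A_k^{-1}\gamma_k)^{-1}$. Therefore, by (1.15) and (2.17) of Bai and Silverstein (2004), we have $E\left|\gamma_k^*A^{-1}\gamma_k - g(z)\right|^2 = E\left|\beta_k + z\,\underline{m}_y(z)\right|^2 \le Kn^{-1}$ with $g(z) = 1 + z\,\underline{m}_y(z)$.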
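The first line of the proof of Lemma 4.2 is a rank-one (Sherman-Morrison) identity and holds exactly for any realization. The following minimal sketch checks it, together with the bound $|\beta_k| \le |z|/\Im z$ used later, assuming real entries and $T_p = I$; the sizes, the index $k$ and the point $z$ are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
p, n, k = 5, 20, 0                             # illustrative sizes and index (assumed)
z = 0.5 + 1.0j

G = rng.standard_normal((p, n)) / np.sqrt(n)   # columns gamma_i, real entries, T_p = I
A = G @ G.T - z * np.eye(p)                    # A(z) = B_x - zI
gamma_k = G[:, k]
A_k = A - np.outer(gamma_k, gamma_k)           # A with the k-th rank-one term removed

quad_k = gamma_k @ np.linalg.solve(A_k, gamma_k)     # gamma_k^* A_k^{-1} gamma_k
beta_k = 1 / (1 + quad_k)
quad_full = gamma_k @ np.linalg.solve(A, gamma_k)    # gamma_k^* A^{-1} gamma_k
```

Since the identity is exact, the content of Lemma 4.2 lies entirely in the concentration of $\beta_k$ around $-z\,\underline{m}_y(z)$.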
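Corollary 4.1 After truncation and normalization, we have $E\left|\gamma_k^*A^{-2}\gamma_k - \frac{d}{dz}\left(1 + z\,\underline{m}_y(z)\right)\right|^2 \le Kn^{-1}$ for every $z \in \mathbb{C}^+$.

Proof. By the Cauchy integral formula, with $v = \Im z$, we have

$$\gamma_k^*A^{-2}(z)\gamma_k = \frac{1}{2\pi i}\oint_{|\zeta - z| = v/2}\frac{\gamma_k^*A^{-1}(\zeta)\gamma_k}{(\zeta - z)^2}\,d\zeta \qquad \text{and} \qquad g'(z) = \frac{1}{2\pi i}\oint_{|\zeta - z| = v/2}\frac{g(\zeta)}{(\zeta - z)^2}\,d\zeta.$$

Then $E\left|\gamma_k^*A^{-2}\gamma_k - \frac{d}{dz}\left(1 + z\,\underline{m}_y(z)\right)\right|^2 \le Kn^{-1}$ follows from Lemma 4.2.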

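Lemma 4.3 For any subset $\mathcal{U}$ of $\{1, 2, \dots, n\}$, after truncating and normalizing, we have $E\left|\mathrm{tr}A^{-1}\Delta\right|^2 \le Kn^{-1}$ for every $z \in \mathbb{C}^+$. Especially, for every $z \in \mathbb{C}^+$,

$$E\left|\mathrm{tr}(A^{-2}\Delta)\right|^2 = E\left|\frac{1}{n}\sum_{j \ne k}\gamma_j^*A^{-2}\gamma_k\right|^2 = O(n^{-1}).$$

Proof. We have $\mathrm{tr}A^{-1}\Delta = \frac{1}{n}\sum_{j \ne k \in \mathcal{U}}\gamma_j^*A^{-1}\gamma_k = \frac{1}{n}\sum_{j \ne k \in \mathcal{U}}\gamma_j^*A_{jk}^{-1}\gamma_k\,\beta_j\beta_{k(j)}$, where $A_{jk} = A_k - \gamma_j\gamma_j^*$ for $j \ne k$ and $\beta_{k(j)} = (1 + \gamma_k^*A_{jk}^{-1}\gamma_k)^{-1}$. We will similarly define $A_{ijk}$ and $\beta_{k(ij)}$ for later use. Then we obtain

$$E\left|\mathrm{tr}(A^{-1}\Delta)\right|^2 = E\,\frac{1}{n^2}\sum_{j_1 \ne k_1 \in \mathcal{U}}\gamma_{j_1}^*A_{j_1k_1}^{-1}\gamma_{k_1}\beta_{j_1}\beta_{k_1(j_1)}\sum_{j_2 \ne k_2 \in \mathcal{U}}\overline{\gamma_{j_2}^*A_{j_2k_2}^{-1}\gamma_{k_2}\beta_{j_2}\beta_{k_2(j_2)}} := \sum\nolimits_{(2)} + \sum\nolimits_{(3)} + \sum\nolimits_{(4)},$$

where the index $(\cdot)$ denotes the number of distinct integers in the set $\{j_1, k_1, j_2, k_2\}$. By the fact that $|\beta_j| \le |z|/v$ with $v = \Im z$, we have

$$\sum\nolimits_{(2)} \le \frac{K}{n^2}\sum_{j \ne k \in \mathcal{U}}E\left|\gamma_j^*A_{jk}^{-1}\gamma_k\right|^2 \le \frac{K}{n^4}\sum_{j \ne k \in \mathcal{U}}E\,\mathrm{tr}\left(T_pA_{jk}^{-1}T_p(A_{jk}^{-1})^*\right) \le Kn^{-1},$$

$$\sum\nolimits_{(4)} = \frac{1}{n^2}\sum_{j_1 \ne k_1 \ne j_2 \ne k_2 \in \mathcal{U}}E\,\gamma_{j_1}^*A_{j_1k_1}^{-1}\gamma_{k_1}\,\overline{\gamma_{j_2}^*A_{j_2k_2}^{-1}\gamma_{k_2}}\;\beta_{j_1}\beta_{k_1(j_1)}\bar\beta_{j_2}\bar\beta_{k_2(j_2)},$$

where

$$\begin{aligned}
\gamma_{j_1}^*A^{-1}\gamma_{k_1} &= \beta_{j_1}\beta_{k_1(j_1)}\,\gamma_{j_1}^*A_{j_1k_1}^{-1}\gamma_{k_1} \\
&= \beta_{j_1}\beta_{k_1(j_1)}\left[\gamma_{j_1}^*A_{j_1k_1k_2}^{-1}\gamma_{k_1} - \beta_{k_2(j_1k_1)}\,\gamma_{j_1}^*A_{j_1k_1k_2}^{-1}\gamma_{k_2}\gamma_{k_2}^*A_{j_1k_1k_2}^{-1}\gamma_{k_1}\right] \\
&= \beta_{j_1}\beta_{k_1(j_1)}\Big[\gamma_{j_1}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{k_1} - \beta_{j_2(j_1k_1k_2)}\,\gamma_{j_1}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{j_2}\gamma_{j_2}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{k_1} \\
&\qquad - \beta_{k_2(j_1k_1)}\,\gamma_{j_1}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{k_2}\gamma_{k_2}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{k_1} \\
&\qquad + \beta_{k_2(j_1k_1)}\beta_{j_2(j_1k_1k_2)}\,\gamma_{j_1}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{j_2}\gamma_{j_2}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{k_2}\gamma_{k_2}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{k_1} \\
&\qquad + \beta_{k_2(j_1k_1)}\beta_{j_2(j_1k_1k_2)}\,\gamma_{j_1}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{k_2}\gamma_{k_2}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{j_2}\gamma_{j_2}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{k_1} \\
&\qquad - \beta_{k_2(j_1k_1)}\beta^2_{j_2(j_1k_1k_2)}\,\gamma_{j_1}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{j_2}\gamma_{j_2}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{k_2}\gamma_{k_2}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{j_2}\gamma_{j_2}^*A_{j_1j_2k_1k_2}^{-1}\gamma_{k_1}\Big]
\end{aligned}$$

and

$$\begin{aligned}
\beta_j &= b_j - \beta_jb_j\varepsilon_j = b_j - b_j^2\varepsilon_j + \beta_jb_j^2\varepsilon_j^2, \\
\beta_{j(k)} &= b_{j(k)} - \beta_{j(k)}b_{j(k)}\varepsilon_{j(k)} = b_{j(k)} - b_{j(k)}^2\varepsilon_{j(k)} + \beta_{j(k)}b_{j(k)}^2\varepsilon_{j(k)}^2,
\end{aligned} \tag{4.5}$$

with $b_j = \frac{1}{1 + E\gamma_j^*A_j^{-1}\gamma_j}$ and $\varepsilon_j = \gamma_j^*A_j^{-1}\gamma_j - E\gamma_j^*A_j^{-1}\gamma_j$, and where $b_{j(k)}$ and $\varepsilon_{j(k)}$ are similarly defined by replacing $A_j^{-1}$ with $A_{jk}^{-1}$.
In the same manner, we can decompose $\gamma_{j_2}^*A^{-1}\gamma_{k_2}$ into 6 similar terms, and then we will estimate the expectations of the 36 products in the expansion of $\gamma_{j_1}^*A^{-1}\gamma_{k_1}(\gamma_{j_2}^*A^{-1}\gamma_{k_2})^*$.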
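The six-term expansion of $\gamma_{j_1}^*A_{j_1k_1}^{-1}\gamma_{k_1}$ is obtained by removing $\gamma_{k_2}$ and then $\gamma_{j_2}$ with two rank-one updates, so it is an exact identity. A minimal numerical check is sketched below, assuming real entries (so that $\gamma^*$ is a transpose), $T_p = I$, and an arbitrary choice of four distinct indices; the helper names `resolvent_without` and `q` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
p, n = 4, 10                                    # illustrative sizes (assumed)
z = 0.3 + 0.8j
G = rng.standard_normal((p, n)) / np.sqrt(n)    # columns gamma_i, real entries
j1, k1, j2, k2 = 0, 1, 2, 3                     # four distinct indices (assumed)

def resolvent_without(removed):
    """Inverse of sum_{i not in removed} gamma_i gamma_i^* - zI."""
    keep = [i for i in range(n) if i not in removed]
    Gk = G[:, keep]
    return np.linalg.inv(Gk @ Gk.T - z * np.eye(p))

A_j1k1_inv = resolvent_without({j1, k1})
A_j1k1k2_inv = resolvent_without({j1, k1, k2})
B = resolvent_without({j1, j2, k1, k2})         # B = A_{j1 j2 k1 k2}^{-1}

beta_k2 = 1 / (1 + G[:, k2] @ A_j1k1k2_inv @ G[:, k2])   # beta_{k2(j1 k1)}
beta_j2 = 1 / (1 + G[:, j2] @ B @ G[:, j2])              # beta_{j2(j1 k1 k2)}

def q(a, b):
    """Bilinear form gamma_a^* B gamma_b."""
    return G[:, a] @ B @ G[:, b]

six_terms = (
    q(j1, k1)
    - beta_j2 * q(j1, j2) * q(j2, k1)
    - beta_k2 * q(j1, k2) * q(k2, k1)
    + beta_k2 * beta_j2 * q(j1, j2) * q(j2, k2) * q(k2, k1)
    + beta_k2 * beta_j2 * q(j1, k2) * q(k2, j2) * q(j2, k1)
    - beta_k2 * beta_j2 ** 2 * q(j1, j2) * q(j2, k2) * q(k2, j2) * q(j2, k1)
)
direct = G[:, j1] @ A_j1k1_inv @ G[:, k1]       # gamma_{j1}^* A_{j1 k1}^{-1} gamma_{k1}
```

The point of the expansion is that $B$ is independent of all four vectors $\gamma_{j_1}, \gamma_{k_1}, \gamma_{j_2}, \gamma_{k_2}$, which makes the moment bounds in the cases below tractable.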

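Case 1. There are at least 6 factors of $A_{j_1j_2k_1k_2}^{-1} := B$ contained in the product. We shall use the fact that all $\beta$-factors are bounded by $|z|/v \le K$. Then we can show that the term is bounded by $O(n^{-3})$. Say, for the product of the two 6-th terms, its expectation is bounded by

$$\begin{aligned}
&K\,E\left|\left(\gamma_{j_1}^*B\gamma_{j_2}\,\gamma_{j_2}^*B\gamma_{k_2}\,\gamma_{k_2}^*B\gamma_{j_2}\,\gamma_{j_2}^*B\gamma_{k_1}\right)\left(\gamma_{j_2}^*B\gamma_{j_1}\,\gamma_{j_1}^*B\gamma_{k_1}\,\gamma_{k_1}^*B\gamma_{j_1}\,\gamma_{j_1}^*B\gamma_{k_2}\right)^*\right| \\
&\quad \le K\left(E\left|\gamma_{j_1}^*B\gamma_{j_2}\,\gamma_{j_1}^*B\gamma_{k_1}\,\gamma_{k_2}^*B\gamma_{j_2}\,\gamma_{j_2}^*B\gamma_{k_1}\right|^2\;E\left|\gamma_{j_2}^*B\gamma_{j_1}\,\gamma_{j_2}^*B\gamma_{k_2}\,\gamma_{k_1}^*B\gamma_{j_1}\,\gamma_{j_1}^*B\gamma_{k_2}\right|^2\right)^{1/2}.
\end{aligned}$$

Note that the factors $\gamma_{j_2}^*B\gamma_{k_2}$ in the first batch and $(\gamma_{j_1}^*B\gamma_{k_1})^*$ in the second batch have exchanged positions when using the Cauchy-Schwarz inequality, in order to avoid the 8th power of the $\gamma$ under the expectation sign.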
Applying the formula 



E 



i=l j=l fe=l 



E (\ Yi \ 2 \ Z i\ 2 + YiZiYjZj + \EXf\ 2 YiZiYjZj) + ^ E|Xi| 4 |y,| 2 |^| 2 £)y fc Z» 



Y, { 2 ( n - 2)|Y £ | 2 |Y' fe | 2 + (n - 2)(|EZ 2 ! 2 + \EX 2 \ 2 )Y 2 Y?) + £ E|X,| 4 |V;| 4 E|^ | 4 



< 2/t(n-2) ( J2\Yi\ 2 I + max{E|X i | 4 E|Z i | 4 -2K}^|y i 



4 

* I 5 



where $\kappa = 2$ for the real case and 1 for the complex case, $X_i$, $Z_k$ are independent random variables with mean 0, variance 1 and finite 4th moment, and further $EX_i^2 = 0$ (and $EZ_k^2 = 0$) if they are complex, we will have



|2 1 



E |(7i I B7i a 7? I B7* 1 7l a B7«7i a B T , *i)| = -EK^B^^B^^B^Jl -y* 2 BTB* 1:j 
< 5e(7* 2 BTB*7 J2 ) 3 + § X>Mt 1 / 2 B7,J 4 7; 2 BTB*7; 2 



< * 



1 \ 3 1 ™ 

— (BTB*T) + -3 I] |e-T 1/2 BTB*T 1/2 ei| 6 E|jr i: 



= 0(n~ 4 ) y 
where $e_i$ is the standard $i$-th unit $p$-vector, i.e., its $i$-th entry is 1 and the other $p-1$ entries are 0. In the last step of the above derivation, we have used the facts that $E|X_{ij_2}|^6 \le \eta_n^2\,n\max_{i,j}E|X_{ij}|^4 = o(n)$ and $e_i^*T^{1/2}BTB^*T^{1/2}e_i \le \|T\|^2/v^2$.

By a similar approach, one can prove that the expectations of the other products with at least 6 factors of $B$ are bounded by $O(n^{-3})$.

Case 2. There are 5 $B$'s contained in the product. We shall use the first expansion of $\beta_{j_1}$ and $\beta_{j_2}$ and then use the bound $|z|/v \le K$ for the $\beta$'s. Then we can show that such terms are also bounded by $O(n^{-3})$. Say, for the product of the first term of the first factor and the 6-th term of the second factor, its expectation is bounded by

$$\begin{aligned}
&\left|E\left[\left(\beta_{j_1}\beta_{k_1(j_1)}\bar\beta_{j_2}\bar\beta_{k_2(j_2)}\bar\beta_{k_1(j_2k_2)}\bar\beta^2_{j_1(j_2k_1k_2)} - b_{j_1}b_{k_1(j_1)}\bar b_{j_2}\bar b_{k_2(j_2)}\bar b_{k_1(j_2k_2)}\bar b^2_{j_1(j_2k_1k_2)}\right)\gamma_{j_1}^*B\gamma_{k_1}\left(\gamma_{j_2}^*B\gamma_{j_1}\,\gamma_{j_1}^*B\gamma_{k_1}\,\gamma_{k_1}^*B\gamma_{j_1}\,\gamma_{j_1}^*B\gamma_{k_2}\right)^*\right]\right| \\
&\quad \le \left(E\left|\left(\beta_{j_1}\beta_{k_1(j_1)}\bar\beta_{j_2}\bar\beta_{k_2(j_2)}\bar\beta_{k_1(j_2k_2)}\bar\beta^2_{j_1(j_2k_1k_2)} - b_{j_1}b_{k_1(j_1)}\bar b_{j_2}\bar b_{k_2(j_2)}\bar b_{k_1(j_2k_2)}\bar b^2_{j_1(j_2k_1k_2)}\right)\gamma_{j_1}^*B\gamma_{k_1}\,\gamma_{j_2}^*B\gamma_{j_1}\right|^2\;E\left|\gamma_{j_1}^*B\gamma_{k_1}\,\gamma_{k_1}^*B\gamma_{j_1}\,\gamma_{j_1}^*B\gamma_{k_2}\right|^2\right)^{1/2} \le O(n^{-3}).
\end{aligned}$$

Here, we have used the fact that each term in the expansion of

$$\beta_{j_1}\beta_{k_1(j_1)}\bar\beta_{j_2}\bar\beta_{k_2(j_2)}\bar\beta_{k_1(j_2k_2)}\bar\beta^2_{j_1(j_2k_1k_2)} - b_{j_1}b_{k_1(j_1)}\bar b_{j_2}\bar b_{k_2(j_2)}\bar b_{k_1(j_2k_2)}\bar b^2_{j_1(j_2k_1k_2)}$$

contains at least one $\varepsilon$ function, which is the centralized quadratic form of $\gamma$. Then, using the same approach employed in Case 1, one can show that the bound is $O(n^{-3})$.

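Case 3. There are fewer than 5 $B$'s contained in the product. If the number is 4, we need to further expand the matrix $A_{j_1}^{-1}$ in $\varepsilon_{j_1}$ as $A_{j_1j_2}^{-1} - A_{j_1j_2}^{-1}\gamma_{j_2}\gamma_{j_2}^*A_{j_1j_2}^{-1}\beta_{j_2(j_1)}$, expand $A_{j_2}^{-1} = A_{j_1j_2}^{-1} - A_{j_1j_2}^{-1}\gamma_{j_1}\gamma_{j_1}^*A_{j_1j_2}^{-1}\beta_{j_1(j_2)}$ in $\varepsilon_{j_2}$, and then use the approach employed in Case 2 to obtain the desired bound.

If the number is less than 4, we need to further expand the inverses of the $A$-matrices. The details are omitted.
Finally, we obtain that

$$\sum\nolimits_{(4)} = O\left(\frac{1}{n}\right).$$

Similarly, we have

$$\sum\nolimits_{(3)} = O\left(\frac{1}{n}\right).$$

Because $\mathrm{tr}(A^{-2}\Delta) = \frac{d}{dz}\mathrm{tr}(A^{-1}\Delta)$, we then have

$$E\left|\mathrm{tr}(A^{-2}\Delta)\right|^2 = E\left|\frac{1}{n}\sum_{j \ne k}\gamma_j^*A^{-2}\gamma_k\right|^2 = O(n^{-1}).$$

The lemma is proved.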

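Lemma 4.4 After truncation and normalization, we have

$$\mathrm{tr}(A^{-2}\Delta A^{-1}\Delta) \to \left(\underline{m}_y(z) + z\,\underline{m}_y'(z)\right)\left(1 + z\,\underline{m}_y(z)\right)$$

in $L_2$ uniformly for $z \in \mathbb{C}^+$.

Proof. Set $\mathrm{tr}A^{-1}(z_1)\Delta A^{-1}(z_2)\Delta = \frac{1}{n^2}\sum_{k \ne s,\,t \ne j}\gamma_j^*A^{-1}(z_1)\gamma_k\,\gamma_s^*A^{-1}(z_2)\gamma_t = Q_1 + Q_2$, where

$$Q_1 = \frac{1}{n^2}\sum_{j \ne s}\gamma_j^*A^{-1}(z_1)\gamma_j\,\gamma_s^*A^{-1}(z_2)\gamma_s$$

and $Q_2$ collects the remaining terms of the sum, that is, those with $(k, t) \ne (j, s)$.

By Lemmas 4.2 and 4.3 we obtain $E\left|Q_1 - (1 + z_1\underline{m}_y(z_1))(1 + z_2\underline{m}_y(z_2))\right|^2 \le Kn^{-1}$ and $E|Q_2|^2 = o(1)$. We thus have $E\left|\mathrm{tr}A^{-1}(z_1)\Delta A^{-1}(z_2)\Delta - (1 + z_1\underline{m}_y(z_1))(1 + z_2\underline{m}_y(z_2))\right|^2 = o(1)$. Consequently, because $\mathrm{tr}A^{-2}(z_1)\Delta A^{-1}(z_2)\Delta = \frac{\partial}{\partial z_1}\mathrm{tr}A^{-1}(z_1)\Delta A^{-1}(z_2)\Delta$, we have $E\left|\mathrm{tr}A^{-2}(z_1)\Delta A^{-1}(z_2)\Delta - \frac{\partial}{\partial z_1}g(z_1)g(z_2)\right|^2 = o(1)$. That is,

$$\mathrm{tr}A^{-2}(z_1)\Delta A^{-1}(z_2)\Delta \to g(z_2)g'(z_1) \quad \text{in } L_2.$$

By setting $z_1 = z_2 = z$, we obtain $\mathrm{tr}(A^{-2}\Delta A^{-1}\Delta) \to g(z)g'(z)$ in $L_2$.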

Lemma 4.5 After truncation and normalization, we have $\mathrm{tr}(A^{-1}\Delta)^3(A - \Delta)^{-1} = g(z)\,\mathrm{tr}\left((A^{-1}\Delta)^2(A - \Delta)^{-1}\right) + o_p(1)$ uniformly for $z \in \mathbb{C}^+$.

Proof. We have

tr(A- 1 A) :i (A-A)- 1 = £-i £ 7*A~ 1 7 j 7*A~ 1 7 fe 7*(A — A) _1 A _1 7 t 
= p X) 7:A- 1 7 J 7;A- 1 7, 1 7:(A-A)- 1 A- 1 7t + ^ TlA-^^^A-S^^A- A)- 1 A- 1 7f 

= X 7gA _1 7 h 7*(A - A^A'St + Op(l) 

= g Wtr((A- 1 A) 2 (A - A)" 1 ) + g(z)± £ 7 ;A- 1 7 , 1 7:(A - A)- 1 A- 1 7t + Op(l) 

s=t 

= g ^)tr((A- 1 A) 2 (A-A)- 1 ) + 0p (l). 
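Then by Lemma 4.4 we have

$$\begin{aligned}
\mathrm{tr}(A^{-1}\Delta)^2(A - \Delta)^{-1} &= \mathrm{tr}(A^{-1}\Delta)^2A^{-1} + \mathrm{tr}(A^{-1}\Delta)^3(A - \Delta)^{-1} \\
&= \mathrm{tr}(A^{-1}\Delta)^2A^{-1} + g(z)\,\mathrm{tr}(A^{-1}\Delta)^2(A - \Delta)^{-1} + o_p(1) \\
&= \frac{\left(1 + z\,\underline{m}_y(z)\right)\left(\underline{m}_y(z) + z\,\underline{m}_y'(z)\right)}{-z\,\underline{m}_y(z)} + o_p(1).
\end{aligned}$$

Hence, we obtain the following lemma.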



Lemma 4.6 After truncation and normalization, we have

$$\mathrm{tr}A^{-2}(z)\Delta + \mathrm{tr}A^{-1}(z)(\Delta A^{-1}(z))^2 + \mathrm{tr}(A(z) - \Delta)^{-1}(\Delta A^{-1}(z))^3 = \frac{\left(\underline{m}_y(z) + z\,\underline{m}_y'(z)\right)\left(1 + z\,\underline{m}_y(z)\right)}{-z\,\underline{m}_y(z)} + o_p(1)$$

uniformly for $z \in \mathbb{C}^+$.